91 research outputs found

    Mapping diversity indices: not a trivial issue

    Mapping diversity indices, that is, estimating values at all locations of a given area from a set of sampled locations, is central to numerous research and applied fields in ecology. Two approaches are used to map diversity indices without including abiotic or biotic variables: (i) the indirect approach, which consists of estimating each individual species' distribution over the area and then stacking the distributions of all species to estimate and map the diversity index a posteriori; (ii) the direct approach, which relies on computing a diversity index at each sampled location and then interpolating these values to all locations of the studied area for mapping. For both approaches, we document drawbacks from theoretical and practical viewpoints and argue for the need for adequate interpolation methods. First, we point out that the indirect approach is problematic because of the high proportion of rare species in natural communities. This leads to zero-inflated distributions, which cannot be interpolated using standard statistical approaches. Secondly, the direct approach is inaccurate because diversity indices are not spatially additive, that is, the diversity of a studied area (e.g. a region) is not the sum of the local diversities. Therefore, the arithmetic variance and some of its derivatives, such as the variogram, are not appropriate for ecologically measuring variation in diversity indices. For the direct approach, we propose to account for β-diversity, which quantifies diversity variation between locations, by means of a β-gram within the interpolation procedure. We applied this method, as well as traditional interpolation methods for comparison purposes, to different faunistic and floristic data sets collected from scientific surveys. We considered two common diversity indices, species richness and Rao's quadratic entropy, noting that the above issues hold for complementary species diversity indices as well as for those dealing with other biodiversity levels such as genetic diversity. We conclude that none of the approaches provided an accurate mapping of diversity indices and that further methodological developments are still needed. We finally discuss lines of research that may resolve this key issue, dealing with conditional simulations and with models taking biotic and abiotic explanatory variables into account.
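
    As a point of reference for the two indices named in the abstract, the minimal Python sketch below computes species richness and Rao's quadratic entropy per sampled site, which is the first step of the direct approach; the abundance table and species distance matrix are hypothetical placeholders, and the interpolation step is deliberately left out, since choosing an adequate interpolator is precisely the open issue the authors raise.

        # Minimal sketch: per-site species richness and Rao's quadratic entropy.
        # The site-by-species abundances and the pairwise species distances are
        # hypothetical placeholders, not data from the study.

        def species_richness(abundances):
            """Number of species with non-zero abundance at a site."""
            return sum(1 for a in abundances if a > 0)

        def rao_quadratic_entropy(abundances, distances):
            """Rao's Q = sum over i,j of d_ij * p_i * p_j, with p the relative abundances."""
            total = sum(abundances)
            if total == 0:
                return 0.0
            p = [a / total for a in abundances]
            return sum(distances[i][j] * p[i] * p[j]
                       for i in range(len(p)) for j in range(len(p)))

        # Two sampled sites, three species (columns), and a symmetric distance matrix.
        sites = {"site_A": [10, 0, 5], "site_B": [2, 8, 1]}
        d = [[0.0, 0.6, 0.9],
             [0.6, 0.0, 0.4],
             [0.9, 0.4, 0.0]]

        for name, abundances in sites.items():
            print(name, species_richness(abundances),
                  round(rao_quadratic_entropy(abundances, d), 3))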

    Arbitration policies for on-demand user-level I/O forwarding on HPC platforms

    I/O forwarding is a well-established and widely-adopted technique in HPC to reduce contention in the access to storage servers and transparently improve I/O performance. Rather than having applications directly access the shared parallel file system, the forwarding technique defines a set of I/O nodes responsible for receiving application requests and forwarding them to the file system, thus reshaping the flow of requests. The typical approach is to statically assign I/O nodes to applications depending on the number of compute nodes they use, which is not necessarily related to their I/O requirements and thus leads to inefficient usage of these resources. This paper investigates arbitration policies based on the applications' I/O demands, represented by their access patterns. We propose a policy based on the Multiple-Choice Knapsack problem that seeks to maximize global bandwidth by giving more I/O nodes to applications that will benefit the most. Furthermore, we propose a user-level I/O forwarding solution as an on-demand service capable of applying different allocation policies at runtime on machines where this layer is not present. We demonstrate our approach's applicability through extensive experimentation and show it can transparently improve global I/O bandwidth by up to 85% in a live setup compared to the default static policy. This study was financed by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. It also received support from the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brazil, and is partially supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under grant PID2019-107255GB and by the Generalitat de Catalunya under contract 2014-SGR-1051. The authors thankfully acknowledge the computer resources, technical expertise and assistance provided by the Barcelona Supercomputing Center. Experiments presented in this paper were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations (see https://www.grid5000.fr).
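
    To make the Multiple-Choice Knapsack formulation concrete, here is a small Python sketch under stated assumptions: each application contributes one class of mutually exclusive options (number of I/O nodes, expected bandwidth), and the policy picks exactly one option per application so that the total I/O-node count stays within the budget while the summed expected bandwidth is maximized. The application names, bandwidth estimates, and the exhaustive search are illustrative only; the paper's policy would rely on a proper MCK solver rather than brute force.

        # Sketch of an MCK-style arbitration policy: pick one (io_nodes, bandwidth)
        # option per application, respecting the total I/O-node budget, while
        # maximizing the summed expected bandwidth. Options below are illustrative.
        from itertools import product

        # For each application: list of (io_nodes, expected_bandwidth_MBps) choices;
        # zero I/O nodes means the application accesses the file system directly.
        options = {
            "app_1": [(0, 300), (2, 900), (4, 1200)],
            "app_2": [(0, 500), (2, 650), (4, 700)],
            "app_3": [(0, 200), (2, 800), (4, 1500)],
        }
        BUDGET = 6  # total I/O nodes available

        best = None
        for combo in product(*options.values()):
            nodes = sum(choice[0] for choice in combo)
            bandwidth = sum(choice[1] for choice in combo)
            if nodes <= BUDGET and (best is None or bandwidth > best[0]):
                best = (bandwidth, dict(zip(options, combo)))

        print("expected aggregate bandwidth:", best[0])
        print("allocation:", best[1])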

    TWINS: Server Access Coordination in the I/O Forwarding Layer

    This paper presents a study of I/O scheduling techniques applied to the I/O forwarding layer. In high-performance computing environments, applications rely on parallel file systems (PFS) to obtain good I/O performance even when handling large amounts of data. To alleviate the concurrency caused by thousands of nodes accessing a significantly smaller number of PFS servers, intermediate I/O nodes are typically placed between processing nodes and the file system. Each intermediate node forwards requests from multiple clients to the system, a setup which gives this component the opportunity to perform optimizations such as I/O scheduling. We evaluate scheduling techniques that improve the spatiality and request size of the access patterns. We show they are only partially effective because the access pattern is not the main factor for read performance in the I/O forwarding layer. A new scheduling algorithm, TWINS, is presented to coordinate the access of intermediate I/O nodes to the data servers. Our proposal decreases concurrency at the data servers, a factor previously shown to negatively affect performance. The proposed algorithm is able to improve read performance from shared files by up to 28% over other scheduling algorithms and by up to 50% over not forwarding I/O.
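
    The coordination idea can be illustrated with a short Python sketch, assuming a fixed window length and a simple rank-based offset so that different I/O nodes target different data servers in the same window; both assumptions are mine for illustration and are not the paper's exact mechanism. In each time window, an I/O node only dispatches the queued requests aimed at the server assigned to that window.

        # Sketch of a TWINS-like window scheme: during each time window an I/O node
        # forwards only requests aimed at one data server, cycling through servers.
        # Window length, server count, rank offset, and the queue are illustrative.
        import time

        NUM_SERVERS = 4
        WINDOW_SECONDS = 0.001  # the real time-window parameter is workload dependent

        def current_server(io_node_rank, start_time):
            """Server this I/O node may access in the current window (offset by rank)."""
            window_index = int((time.monotonic() - start_time) / WINDOW_SECONDS)
            return (window_index + io_node_rank) % NUM_SERVERS

        def dispatch(queue, io_node_rank, start_time):
            """Forward only requests whose target server matches the current window."""
            allowed = current_server(io_node_rank, start_time)
            ready = [req for req in queue if req["server"] == allowed]
            for req in ready:
                queue.remove(req)
            return ready  # these would be sent on to the parallel file system

        start = time.monotonic()
        queue = [{"server": s, "offset": 4096 * s} for s in range(NUM_SERVERS)] * 2
        print(dispatch(queue, io_node_rank=0, start_time=start))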

    Energy Efficiency and I/O Performance of Low-Power Architectures

    This paper presents an energy efficiency and I/O performance analysis of low-power architectures compared to conventional architectures, with the goal of studying the viability of using them as storage servers. Our results show that, despite the fact that the power demand of the storage device accounts for a small fraction of the power demand of the whole system, significant increases in power demand are observed when accessing the storage device. We investigate the impact of the access pattern on power demand, looking at the whole system and at the storage device by itself, and compare all tested configurations regarding energy efficiency. We then extrapolate the conclusions from this research to provide guidelines for considering the replacement of traditional storage servers by low-power alternatives. We show that the choice depends on the expected workload, on estimates of the power demand of the systems, and on the factors limiting performance. These guidelines can be applied to architectures other than the ones used in this work.
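
    A back-of-the-envelope way to weigh the trade-off described above is energy per volume of data served, i.e. average power divided by sustained bandwidth. The Python sketch below illustrates that comparison; the power and bandwidth figures are placeholders I chose for illustration, not measurements from the paper, and they only show why a low-power board does not automatically win once its limited bandwidth is taken into account.

        # Back-of-the-envelope comparison when weighing low-power storage servers
        # against conventional ones: joules spent per gigabyte served.
        # All figures below are placeholders, not measurements from the paper.

        def joules_per_gigabyte(avg_power_watts, bandwidth_mbps):
            seconds_per_gb = 1024.0 / bandwidth_mbps  # 1 GB = 1024 MB
            return avg_power_watts * seconds_per_gb

        candidates = {
            "conventional_server": {"power_w": 180.0, "bandwidth_mbps": 900.0},
            "low_power_board":     {"power_w": 12.0,  "bandwidth_mbps": 110.0},
        }

        for name, c in candidates.items():
            print(name,
                  round(joules_per_gigabyte(c["power_w"], c["bandwidth_mbps"]), 1),
                  "J/GB")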

    Towards On-Demand I/O Forwarding in HPC Platforms

    I/O forwarding is an established and widely-adopted technique in HPC to reduce contention and improve I/O performance in the access to shared storage infrastructure. On such machines, this layer is often physically deployed on dedicated nodes, and their connection to the clients is static. Furthermore, the increasingly heterogeneous workloads entering HPC installations stress the I/O stack, requiring tuning and reconfiguration based on the applications' characteristics. Nonetheless, it is not always feasible in a production system to explore the potential benefits of this layer under different configurations without impacting clients. In this paper, we investigate the effects of I/O forwarding on performance by considering the application's I/O access patterns and system characteristics. We aim to explore when forwarding is the best choice for an application, how many I/O nodes it would benefit from, and whether not using forwarding at all might be the correct decision. To gather performance metrics and to explore and understand the impact of forwarding I/O requests of different access patterns, we implemented FORGE, a lightweight I/O forwarding layer in user space. Using FORGE, we evaluated the optimal forwarding configurations for several access patterns on the MareNostrum 4 (Spain) and Santos Dumont (Brazil) supercomputers. Our results demonstrate that shifting the focus from a static system-wide deployment to an on-demand, reconfigurable I/O forwarding layer dictated by application demands can improve I/O performance on future machines.
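
    The per-application decision argued for above can be sketched in a few lines of Python: given profiled bandwidth for an access pattern under candidate forwarding configurations, pick how many I/O nodes to use, where zero means bypassing the forwarding layer entirely. The pattern names and bandwidth numbers are placeholders of mine, and a real deployment would obtain them from measurements such as those collected with FORGE.

        # Sketch of a per-application forwarding decision: choose the configuration
        # with the best profiled bandwidth. Profiles below are placeholders.

        def choose_forwarding(profile):
            """profile maps number_of_io_nodes -> measured bandwidth (MB/s)."""
            io_nodes, bandwidth = max(profile.items(), key=lambda kv: kv[1])
            return io_nodes, bandwidth

        profiles = {
            "small_random_reads":      {0: 820, 1: 610, 2: 640, 4: 650},
            "large_contiguous_writes": {0: 700, 1: 1100, 2: 1900, 4: 2100},
        }

        for pattern, profile in profiles.items():
            nodes, bw = choose_forwarding(profile)
            decision = "no forwarding" if nodes == 0 else f"{nodes} I/O node(s)"
            print(f"{pattern}: {decision} ({bw} MB/s)")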

    Adaptive Request Scheduling for the I/O Forwarding Layer using Reinforcement Learning

    I/O optimization techniques such as request scheduling can improve performance mainly for the access patterns they target, or they depend on the precise tuning of parameters. In this paper, we propose an approach to adapt the I/O forwarding layer of HPC systems to the application access patterns by tuning a request scheduler. Our case study is the TWINS scheduling algorithm, where performance improvements depend on the time-window parameter, which in turn depends on the current workload. Our approach uses a reinforcement learning technique, contextual bandits, to make the system capable of learning the best parameter value for each access pattern during its execution, without a previous training phase. We evaluate our proposal and demonstrate it can achieve a precision of 88% in the parameter selection within the first few hundred observations of an access pattern. After having observed an access pattern for a few minutes (not necessarily contiguously), we show that the system will be able to optimize its performance for the remaining life of the system (years).
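
    A minimal sketch of the adaptive idea, assuming an epsilon-greedy contextual bandit rather than the paper's exact agent: per access-pattern context, the agent keeps a running average of the reward (e.g. observed bandwidth) for each candidate time-window value, exploiting the best-known value most of the time and exploring occasionally. The candidate windows, the context descriptor, and the reward value below are illustrative assumptions.

        # Epsilon-greedy contextual bandit learning a time-window value per
        # access-pattern context. Candidates, context, and reward are illustrative.
        import random
        from collections import defaultdict

        WINDOWS_US = [125, 250, 500, 1000, 2000]   # candidate time-window lengths
        EPSILON = 0.1

        counts = defaultdict(lambda: [0] * len(WINDOWS_US))
        values = defaultdict(lambda: [0.0] * len(WINDOWS_US))

        def select_window(context):
            """Pick a time-window index for this access-pattern context."""
            if random.random() < EPSILON:
                return random.randrange(len(WINDOWS_US))            # explore
            return max(range(len(WINDOWS_US)), key=values[context].__getitem__)  # exploit

        def update(context, arm, reward):
            """Incrementally average the observed reward (e.g. bandwidth in MB/s)."""
            counts[context][arm] += 1
            values[context][arm] += (reward - values[context][arm]) / counts[context][arm]

        # Usage: each scheduling period, observe the pattern, pick a window,
        # measure the resulting bandwidth, then feed it back.
        context = ("contiguous", "32KiB", "shared_file")   # hypothetical descriptor
        arm = select_window(context)
        update(context, arm, reward=1450.0)                # placeholder measurement
        print("chosen window (us):", WINDOWS_US[arm])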

    SMART: An Application Framework for Real Time Big Data Analysis on Heterogeneous Cloud Environments

    The amount of data that human activities generate poses a challenge to current computer systems. Big data processing techniques are evolving to address this challenge, with analysis increasingly being performed using cloud-based systems. Emerging services, however, require additional enhancements to ensure their applicability to highly dynamic and heterogeneous environments and to facilitate their use by Small & Medium-sized Enterprises (SMEs). Observing this landscape in emerging computing system development, this work presents Small & Medium-sized Enterprise Data Analytic in Real Time (SMART), which addresses some of the issues in providing compute service solutions for SMEs. SMART offers a framework for the efficient development of big data analysis services suitable for small and medium-sized organizations, considering very heterogeneous data sources, from wireless sensor networks to data warehouses, and focusing on service composability for a number of domains. This paper presents the basis of this proposal and preliminary results on exploring application deployment on hybrid infrastructure.

    ELO and IRT: estimating student ability on an online programming platform

    Methods for assessing student proficiency have gained prominence in recent years. There is a growing number of online courses and platforms that provide repositories of questions or exercises in which assessment happens automatically. This work analyzes the data generated by two models whose goal is to estimate student ability: Elo and Item Response Theory (IRT). Elo was developed to rank players based on their game history, while IRT estimates ability from a set of responses given to a set of items. We use a data set made available by a Brazilian Online Judge platform. The results point to differences between the models in the estimated abilities, differences we believe are related to the way each model estimates its parameters.
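
    For readers unfamiliar with the two models being contrasted, the Python sketch below shows the standard Elo update driven by one submission at a time and the probability of a correct response under a logistic (2PL) IRT model. The K-factor, ratings, and parameter values are conventional illustrative defaults, not the values fitted in the study.

        # Standard Elo update (problem treated as an opponent) and 2PL IRT response
        # probability. Constants are conventional defaults, not fitted values.
        import math

        def elo_expected(student_rating, problem_rating):
            """Expected score of the student against a problem of a given rating."""
            return 1.0 / (1.0 + 10 ** ((problem_rating - student_rating) / 400.0))

        def elo_update(student_rating, problem_rating, solved, k=32.0):
            """New student rating after one attempt (solved = 1.0 or 0.0)."""
            return student_rating + k * (solved - elo_expected(student_rating, problem_rating))

        def irt_probability(theta, difficulty_b, discrimination_a=1.0):
            """2PL IRT: probability of a correct response given ability theta."""
            return 1.0 / (1.0 + math.exp(-discrimination_a * (theta - difficulty_b)))

        print(round(elo_update(1500, 1600, solved=1.0), 1))   # rating rises after a solve
        print(round(irt_probability(theta=0.5, difficulty_b=0.0), 3))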